US Daily Reports - EDA

Data source: "COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University" url: https://github.com/CSSEGISandData/COVID-19.

Johns Hopkins University provides several datasets, each with slightly different information. The time series files contain all data to date for each state and county, but only total cases for each location. The daily US files contain a richer dataset, including testing ratio and hospitalization rates, but each file covers only one day, the data starts only in April, and no county-level data is provided. The daily global files have data similar to the daily US files, but testing ratio and hospitalization rates are not available. The table formats also change over time: not all columns are available in the earlier parts of the data.
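
Because each daily file covers a single day and the columns differ between files, the daily reports have to be stacked before they can be analyzed as a time series. The following is a minimal sketch of one way to do that with pandas. The repository path in `BASE_URL`, the `MM-DD-YYYY.csv` file naming, and the helper names `daily_us_report_url` and `combine_daily_reports` are assumptions made for illustration, not something stated above.

```python
from datetime import date

import pandas as pd

# Assumed location of the daily US reports inside the CSSE repository.
BASE_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_daily_reports_us"
)


def daily_us_report_url(day: date) -> str:
    """Build the URL for one day's report (assumes files are named MM-DD-YYYY.csv)."""
    return f"{BASE_URL}/{day.strftime('%m-%d-%Y')}.csv"


def combine_daily_reports(frames: dict) -> pd.DataFrame:
    """Stack per-day frames keyed by date into one long table.

    Columns that are missing from an earlier file are filled with NaN,
    so changes in the table format over time do not break the stack.
    """
    tagged = [
        df.assign(report_date=pd.Timestamp(day)) for day, df in frames.items()
    ]
    return pd.concat(tagged, ignore_index=True, sort=False)
```

With network access, the per-day frames could be loaded with something like `{d: pd.read_csv(daily_us_report_url(d)) for d in days}` and passed to `combine_daily_reports`.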

First we will review the US Daily Reports.

Daily Global Reports

XGBoost predictive models to review predictive utility of various features

Baseline model

Baseline with lagged cases
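
Lagged case counts can be built by shifting each location's series by a fixed number of days. This is a minimal sketch, assuming a long table with one row per location and day; the column names `Province_State`, `report_date`, and `Confirmed`, the default lags, and the helper name `add_lagged_cases` are illustrative assumptions.

```python
import pandas as pd


def add_lagged_cases(
    df: pd.DataFrame,
    lags=(1, 7),
    group_col: str = "Province_State",  # assumed location column
    date_col: str = "report_date",      # assumed date column
    value_col: str = "Confirmed",       # assumed case-count column
) -> pd.DataFrame:
    """Add one column per lag holding the value from `lag` rows (days) earlier."""
    out = df.sort_values([group_col, date_col]).copy()
    for lag in lags:
        # Shift within each location so values never leak across locations.
        out[f"{value_col}_lag{lag}"] = out.groupby(group_col)[value_col].shift(lag)
    return out
```

The first `lag` rows of each location have no earlier value and are left as NaN. The shift counts rows, so it matches days only if every location has one row per day.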

Baseline plus Neighbor model

Lagged model with lagged neighbors

Baseline with Neighbors and Rates of Change
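
One way to express a rate of change is as the day-over-day difference and the day-over-day fractional change of the case count within each location. The sketch below assumes the same hypothetical column names (`Province_State`, `report_date`, `Confirmed`); the helper name `add_rates_of_change` is also illustrative.

```python
import pandas as pd


def add_rates_of_change(
    df: pd.DataFrame,
    group_col: str = "Province_State",  # assumed location column
    date_col: str = "report_date",      # assumed date column
    value_col: str = "Confirmed",       # assumed case-count column
) -> pd.DataFrame:
    """Add absolute and fractional day-over-day change per location."""
    out = df.sort_values([group_col, date_col]).copy()
    grouped = out.groupby(group_col)[value_col]
    out[f"{value_col}_diff"] = grouped.diff()              # new cases since previous row
    out[f"{value_col}_pct_change"] = grouped.pct_change()  # fractional growth
    return out
```

The first row of each location has no previous value, so both new columns are NaN there.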

Baseline plus rolling averages
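
A rolling average smooths the daily counts over a trailing window. The following is a minimal sketch, assuming the same hypothetical long table and column names (`Province_State`, `report_date`, `Confirmed`); the 7-day default window and the helper name `add_rolling_average` are assumptions.

```python
import pandas as pd


def add_rolling_average(
    df: pd.DataFrame,
    window: int = 7,
    group_col: str = "Province_State",  # assumed location column
    date_col: str = "report_date",      # assumed date column
    value_col: str = "Confirmed",       # assumed case-count column
) -> pd.DataFrame:
    """Add a trailing mean over the current row and the previous window-1 rows."""
    out = df.sort_values([group_col, date_col]).copy()
    out[f"{value_col}_roll{window}"] = out.groupby(group_col)[value_col].transform(
        # min_periods=1 averages whatever history exists at the start of a series.
        lambda s: s.rolling(window, min_periods=1).mean()
    )
    return out
```

The window here includes the current day. If the same column is also the prediction target, shifting the series by one day before averaging would keep the target's own value out of the feature.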

Baseline plus neighbors and rolling averages

Citations

Data source: "COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University" url: https://github.com/CSSEGISandData/COVID-19.

Useful links: